SPRTechnote


Sametime server defensive fix for Notes instant messaging

Technote Number: 1230709


Problem:
Under very specific circumstances the Sametime® server can receive incoming
requests at an extremely high rate from the Notes® client. These incoming
requests must be resolved in order for instant messaging users to communicate
and share presence information. As a result of receiving these requests at an
extremely high rate, the Sametime server can become unresponsive as it consumes
system resources during the processing of these incoming messages. The user
will receive an error message indicating that they have been logged out of
Sametime, with no indication as to why the user was disconnected. Disconnected
users will be able to immediately reconnect to the Sametime community.
The Sametime server's state of unresponsiveness may manifest itself as
out-of-memory errors or by disconnecting from the Sametime Mux (which is used
to route instant messages).

Symptoms of this problem can include:
- The nlnotes process on the Notes client spikes to 100% CPU. Logging off from
  Notes client IM returns the CPU usage of the nlnotes process to normal.
- CPU usage for multiple processes on the Sametime server is pegged at
  above-normal levels and may eventually reach 100%.
- Attempting to open a chat session with another client shows the message
  "Initializing chat: Resolving User Name". This is an organization-wide outage.
- Attempting to add someone to a contact list takes an inordinate amount of
  time.
- On iSeries this problem appears as an abnormal termination of the StMux
  task. You would probably not notice a CPU spike because the CPU has so much
  processing power available.
- The ST Resolve process can also crash.

In addition to the Cumulative Client Hotfix (CCH), which has been released for
Notes 6.5.5 and 7.0 clients (see technote #1206369), a server-side defensive
workaround has been developed for Sametime 6.5.1 FP1 and 7.0 and integrated
into Sametime 7.5. This workaround will detect Notes clients that enter the
looping state and disconnect them from the Sametime Community. The user will
not receive an error message stating why they were disconnected. Disconnected
users will be able to immediately reconnect to the Sametime community.

To determine the clients that exhibit the looping behavior, the STMux Server
Application counts incoming messages sent from a client. If the count exceeds
the predefined threshold, the STMux will disconnect the client and an error
will be recorded in the Sametime Log (STLog.nsf).
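
The counting approach described above can be illustrated with a minimal sketch. This is not the STMux implementation, which is not documented here. It assumes a fixed counting window per client and assumes the interval is measured in seconds; the names ThresholdMonitor, ClientWindow and record are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClientWindow:
    """Message count for one client in the current counting window."""
    window_start: float
    count: int = 0


@dataclass
class ThresholdMonitor:
    """Hypothetical fixed-window counter (an illustration, not STMux code)."""
    interval: float      # window length; assumed to be seconds
    max_messages: int    # messages allowed per client per window
    windows: dict = field(default_factory=dict)

    def record(self, client_id: str, now: float) -> bool:
        """Count one incoming message.

        Returns True when the client has exceeded the threshold and
        should be disconnected, False otherwise.
        """
        window = self.windows.get(client_id)
        # Start a fresh window for a new client or when the old window expired.
        if window is None or now - window.window_start >= self.interval:
            window = ClientWindow(window_start=now)
            self.windows[client_id] = window
        window.count += 1
        if window.count > self.max_messages:
            # Forget the client so a reconnect starts from a clean count.
            del self.windows[client_id]
            return True
        return False
```

With ThresholdMonitor(interval=10, max_messages=10000), a client would be flagged on the 10001st message received within one window. A caller would then disconnect that client and write a log entry.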

The threshold is configured using the following sametime.ini flags. Please do
not change these values; they are the recommended settings for detecting a
client attack correctly:
[Config]
VPMX_THRESHOLD_INTERVAL=10
VPMX_THRESHOLD_MSG_COUNT=10000
VPMX_THRESHOLD_SEND_MSG_COUNT=5000
VPMX_THRESHOLD_CREATE_MSG_COUNT=5000
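
Because these values should stay at the recommended settings, a small check can confirm that a copy of sametime.ini still contains them. The following is a sketch only. It assumes the file can be read by Python's configparser, which may not hold for every sametime.ini (for example, one with lines before the first section header), and check_thresholds is a hypothetical helper name.

```python
import configparser

# Recommended values, copied from the [Config] flags listed above.
RECOMMENDED = {
    "VPMX_THRESHOLD_INTERVAL": "10",
    "VPMX_THRESHOLD_MSG_COUNT": "10000",
    "VPMX_THRESHOLD_SEND_MSG_COUNT": "5000",
    "VPMX_THRESHOLD_CREATE_MSG_COUNT": "5000",
}


def check_thresholds(ini_text: str) -> dict:
    """Return {flag: (found, expected)} for flags that are missing or differ."""
    parser = configparser.ConfigParser(
        strict=False,          # tolerate duplicate keys or sections
        allow_no_value=True,   # tolerate bare keys without "="
        interpolation=None,    # treat "%" literally
    )
    parser.optionxform = str   # keep flag names case-sensitive
    parser.read_string(ini_text)

    mismatches = {}
    for flag, expected in RECOMMENDED.items():
        found = parser.get("Config", flag, fallback=None)
        if found is None or found.strip() != expected:
            mismatches[flag] = (found, expected)
    return mismatches
```

An empty result means all four flags are present under [Config] with the recommended values.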

To get relevant traces, add the following flags to sametime.ini under the
[Debug] section:
VPS_AUTH_DEBUG=1
VP_LDAP_TRACE=1
VP_REG_TRACE=1
VPS_DEBUG_CHANNEL_MSG=1
VPS_DEBUG_CONFIG=1
VPS_DEBUG_COP=1
VPS_DEBUG_GATEWAY_MSG=1
VPS_DEBUG_LOGIN_MSG=1
VPS_DEBUG_OTM_MSG=1
VPS_DEBUG_SERVICE_MSG=1
VPS_DEBUG_STATS_MSG=1
VPS_DEBUG_USER_MSG=1
VPMX_CLIENT_THRESHOLDS_DEBUG=1
UCM_DEBUG=1
UCM_KERNEL=1
UCM_SELECT=1
UCM_NOTIFY=1
UCM_MESSAGES=1
VPHMX_PURE_HTTP_DEBUG=1
VPMX_TCP_DEBUG=1
VPMX_CNL_DEBUG=1
VPMX_MSG_DEBUG=1
VPMX_HTTP_DEBUG=1
VPMX_DEBUG=1
VPMX_ROUTING_DEBUG=1

These debug flags record trace information in the STMux_*.txt files located in
the Domino trace directory (default: \lotus\domino\trace). Any client that is
disconnected will see error code 80000233. The Sametime Log (STLog.nsf) will
also record the disconnect with the reason "Client exceeds threshold".
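
When several trace files have accumulated, a short script can help locate lines of interest. The sketch below searches the STMux_*.txt files for a marker string. The trace line format is not described in this technote, so whether the error code 80000233 appears literally in the trace is an assumption; pass a different marker if it does not. find_marker_lines is a hypothetical helper name.

```python
from pathlib import Path


def find_marker_lines(trace_dir, marker="80000233"):
    """Yield (file name, line number, line) for trace lines containing marker.

    The default marker is the client error code from this technote; that it
    shows up verbatim in the trace files is an assumption.
    """
    for path in sorted(Path(trace_dir).glob("STMux_*.txt")):
        # The file encoding is unknown, so undecodable bytes are replaced.
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if marker in line:
                    yield path.name, number, line.rstrip("\n")
```

For the default trace location, list(find_marker_lines(r"\lotus\domino\trace")) collects the matching lines.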



To obtain the server-side defensive fix, please open a PMR with IBM Technical
Support.

A client-side fix has also been put into Notes 6.5.6 and Notes 7.0.2. Refer to
the Upgrade Central site for details on upgrading Notes/Domino.

Excerpt from the Lotus Notes and Domino Release 6.5.6 MR fix list (available at
http://www.ibm.com/developerworks/lotus):
SPR# TPAE6K5JZG - Certain rare combinations of user actions can put the Notes
client in a state where it sends a continuous stream of name resolve requests,
potentially enough requests rapidly enough to overwhelm the Sametime server and
cause it to crash. One symptom of this problem is if the nlnotes process on
the Notes client spikes to 100%. Logging off from the Notes client instant
messaging returns the CPU on the Nlnotes process to normal. The best
indication of this problem is one (or more) Notes clients spiking to 100% CPU
usage and the ST server(s) slowing to a crawl at the same time.